293 research outputs found

    Distinct genealogies for plasmids and chromosome

    An earlier perspective on the diversity of conjugative elements in microbes [1] attempted to provide a broad audience with an introductory overview of the arcane biology of mobile genetic elements and their terminologies. It might well have been entitled "Plasmids, ICEs, IMEs, and Other Mobile Elements for Dummies," but common sense prevailed. This perspective introduces two related articles in the current issue of PLOS Genetics [2,3] and might have equally aptly been entitled "Antibiotic-Resistant Plasmids and Their Epidemiology for Dummies."

    How old are bacterial pathogens?

    Only a few molecular studies have addressed the age of bacterial pathogens that infected humans before the beginnings of medical bacteriology, but these have provided dramatic insights. The global genetic diversity of Helicobacter pylori, which infects human stomachs, parallels that of its human host. The time to the Most Recent Common Ancestor (tMRCA) of these bacteria approximates that of anatomically modern humans, i.e. at least 100,000 years, after calibrating the evolutionary divergence within H. pylori against major ancient human migrations. Similarly, genomic reconstructions of Mycobacterium tuberculosis, the cause of tuberculosis, from ancient skeletons in South America and mummies in Hungary support estimates of <6,000 years for the tMRCA of M. tuberculosis. Finally, modern global patterns of genetic diversity and ancient DNA studies indicate that during the last 5,000 years plague caused by Yersinia pestis has spread globally on multiple occasions from China and Central Asia. Such tMRCA estimates provide only lower bounds on the ages of bacterial pathogens, and additional studies are needed for realistic upper bounds on how long humans and animals have suffered from bacterial diseases.

    Metagenomics of the modern and historical human oral microbiome with phylogenetic studies on Streptococcus mutans and Streptococcus sobrinus

    We have recently developed bioinformatic tools to accurately assign metagenomic sequence reads to microbial taxa: SPARSE [1] for probabilistic, taxonomic classification of sequence reads, EToKi [2] for assembling and polishing genomes from short read sequences, and GrapeTree [3], a graphic visualizer of genetic distances between large numbers of genomes. Together, these methods support comparative analyses of genomes from ancient skeletons and modern humans [2,4]. Here we illustrate these capabilities with 784 samples from historical dental calculus, modern saliva and modern dental plaque. The analyses revealed 1591 microbial species within the oral microbiome. We anticipated that the oral complexes of Socransky et al. [5] would predominate among taxa whose frequencies differed by source. However, although some species discriminated between sources, we could not confirm the existence of the complexes. The results also illustrate further functionality of our pipelines with two species that are associated with dental caries, Streptococcus mutans and Streptococcus sobrinus. They were rare in historical dental calculus but common in modern plaque, and even more common in saliva. Reconstructed draft genomes of these two species from metagenomic samples in which they were abundant were combined with modern public genomes to provide a detailed overview of their core genomic diversity

    Formal comment to Pettengill: the time to most recent common ancestor does not (usually) approximate the date of divergence

    In 2013 Zhou et al. concluded that Salmonella enterica serovar Agona represents a genetically monomorphic lineage of recent ancestry, whose most recent common ancestor existed in 1932, or earlier. The Abstract stated ‘Agona consists of three lineages with minimal mutational diversity: only 846 single nucleotide polymorphisms (SNPs) have accumulated in the non-repetitive, core genome since Agona evolved in 1932 and subsequently underwent a major population expansion in the 1960s.’ These conclusions have now been criticized by Pettengill, who claims that the evolutionary models used to date Agona may not have been appropriate, the dating estimates were inaccurate, and the age of emergence of Agona should have been qualified by an upper limit reflecting the date of its divergence from an outgroup, serovar Soerenga. We dispute these claims. Firstly, Pettengill’s analysis of Agona is not justifiable on technical grounds. Secondly, an upper limit for divergence from an outgroup would only be meaningful if the outgroup were closely related to Agona, but close relatives of Agona are yet to be identified. Thirdly, it is not possible to reliably date the time of divergence between Agona and Soerenga. We conclude that Pettengill’s criticism is comparable to a tempest in a teapot

    Accurate reconstruction of bacterial pan- and core genomes with PEPPAN

    Bacterial genomes can contain traces of a complex evolutionary history, including extensive homologous recombination, gene loss, gene duplications and horizontal gene transfer. In order to reconstruct the phylogenetic and population history of a set of multiple bacteria, it is necessary to examine their pangenome, the composite of all the genes in the set. Here we introduce PEPPAN, a novel pipeline that can reliably construct pangenomes from thousands of genetically diverse bacterial genomes that represent the diversity of an entire genus. PEPPAN outperforms existing pangenome methods by providing consistent gene and pseudogene annotations extended by similarity-based gene predictions, and identifying and excluding paralogs by combining tree- and synteny-based approaches. The PEPPAN package additionally includes PEPPAN_parser, which implements additional downstream analyses including the calculation of trees based on accessory gene content or allelic differences between core genes. In order to test the accuracy of PEPPAN, we implemented SimPan, a novel pipeline for simulating the evolution of bacterial pangenomes. We compared the accuracy and speed of PEPPAN with four state-of-the-art pangenome pipelines using both empirical and simulated datasets. PEPPAN was more accurate and more specific than any of the other pipelines and was almost as fast as any of them. As a case study, we used PEPPAN to construct a pangenome of ~40,000 genes from 3052 representative genomes spanning at least 80 species of Streptococcus. The resulting gene and allelic trees provide an unprecedented overview of the genomic diversity of the entire Streptococcus genus

    Neutral genomic microevolution of a recently emerged pathogen, Salmonella enterica serovar Agona

    Salmonella enterica serovar Agona has caused multiple food-borne outbreaks of gastroenteritis since it was first isolated in 1952. We analyzed the genomes of 73 isolates from global sources, comparing five distinct outbreaks with sporadic infections as well as food contamination and the environment. Agona consists of three lineages with minimal mutational diversity: only 846 single nucleotide polymorphisms (SNPs) have accumulated in the non-repetitive, core genome since Agona evolved in 1932 and subsequently underwent a major population expansion in the 1960s. Homologous recombination with other serovars of S. enterica imported 42 recombinational tracts (360 kb) in 5/143 nodes within the genealogy, which resulted in 3,164 additional SNPs. In contrast to this paucity of genetic diversity, Agona is highly diverse according to pulsed-field gel electrophoresis (PFGE), which is used to assign isolates to outbreaks. PFGE diversity reflects a highly dynamic accessory genome associated with the gain or loss (indels) of 51 bacteriophages, 10 plasmids, and 6 integrative conjugational elements (ICE/IMEs), but did not correlate uniquely with outbreaks. Unlike the core genome, indels occurred repeatedly in independent nodes (homoplasies), resulting in inaccurate PFGE genealogies. The accessory genome contained only a few cargo genes relevant to infection, other than antibiotic resistance. Thus, most of the genetic diversity within this recently emerged pathogen reflects changes in the accessory genome, or is due to recombination, but these changes seemed to reflect neutral processes rather than Darwinian selection. Each outbreak was caused by an independent clade, without universal, outbreak-associated genomic features, and none of the variable genes in the pan-genome seemed to be associated with an ability to cause outbreaks.

    BlastFrost: fast querying of 100,000s of bacterial genomes in Bifrost graphs

    BlastFrost is a highly efficient method for querying 100,000s of genome assemblies, building on Bifrost, a dynamic data structure for compacted and colored de Bruijn graphs. BlastFrost queries a Bifrost data structure for sequences of interest, and extracts local subgraphs, enabling the identification of the presence or absence of individual genes or single nucleotide sequence variants. We show two examples using Salmonella genomes, finding within minutes the presence of genes in the SPI-2 pathogenicity island in a collection of 926 genomes; and identifying single nucleotide polymorphisms associated with fluoroquinolone resistance in three genes among 190,209 genomes. BlastFrost is available at https://github.com/nluhmann/BlastFrost
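
The idea of querying a colored k-mer structure for gene presence can be illustrated with a toy sketch. This is not BlastFrost's algorithm or Bifrost's data structure: it uses a plain Python dictionary instead of a compacted de Bruijn graph, and every name here (`build_colored_index`, `query_presence`, the toy genomes, `k=4`) is a hypothetical chosen for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_colored_index(genomes, k):
    """Map each k-mer to the set of genome names (its 'colors') containing it."""
    index = defaultdict(set)
    for name, seq in genomes.items():
        for km in kmers(seq, k):
            index[km].add(name)
    return index


def query_presence(index, query, k, genomes):
    """Return, per genome, the fraction of query k-mers found in that genome."""
    query_kmers = list(kmers(query, k))
    hits = {name: 0 for name in genomes}
    for km in query_kmers:
        # .get avoids inserting unseen k-mers into the defaultdict
        for name in index.get(km, ()):
            hits[name] += 1
    return {name: hits[name] / len(query_kmers) for name in genomes}


# Toy data (assumed): genome_B differs from genome_A by one substitution.
genomes = {
    "genome_A": "ATGGCGTACGTTAGC",
    "genome_B": "ATGGCGTTCGTTAGC",
    "genome_C": "TTTTTTTTTTTTTTT",
}
index = build_colored_index(genomes, k=4)
result = query_presence(index, "GCGTACGT", k=4, genomes=genomes)
```

In this toy setting a genome carrying the full query scores 1.0, while a single substitution removes every query k-mer that overlaps the changed base, which is why k-mer lookups can flag single nucleotide variants as well as gene presence or absence.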

    The EnteroBase user's guide, with case studies on Salmonella transmissions, Yersinia pestis phylogeny and Escherichia core genomic diversity

    EnteroBase is an integrated software environment which supports the identification of global population structures within several bacterial genera that include pathogens. Here, we provide an overview of how EnteroBase works, what it can do, and its future prospects. EnteroBase has currently assembled more than 300,000 genomes from Illumina short reads from Salmonella, Escherichia, Yersinia, Clostridioides, Helicobacter, Vibrio, and Moraxella, and genotyped those assemblies by core genome Multilocus Sequence Typing (cgMLST). Hierarchical clustering of cgMLST sequence types allows mapping a new bacterial strain to predefined population structures at multiple levels of resolution within a few hours after uploading its short reads. Case study 1 illustrates this process for local transmissions of Salmonella enterica serovar Agama between neighboring social groups of badgers and humans. EnteroBase also supports SNP calls from both genomic assemblies and after extraction from metagenomic sequences, as illustrated by case study 2 which summarizes the microevolution of Yersinia pestis over the last 5,000 years of pandemic plague. EnteroBase can also provide a global overview of the genomic diversity within an entire genus, as illustrated by case study 3 which presents a novel, global overview of the population structure of all of the species, subspecies and clades within Escherichia.
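
Clustering cgMLST sequence types at several levels of resolution can be sketched as follows. The abstract does not specify the clustering method, so this sketch assumes single-linkage clustering on allelic distance; the function names, the four toy profiles, the thresholds, and the convention that allele 0 marks a missing locus are all hypothetical.

```python
def allelic_distance(a, b):
    """Count loci whose allele numbers differ; 0 marks a missing locus and is skipped."""
    return sum(1 for x, y in zip(a, b) if x and y and x != y)


def single_linkage_clusters(profiles, threshold):
    """Group sequence types linked by chains of pairs at most `threshold` alleles apart."""
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if allelic_distance(profiles[a], profiles[b]) <= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(sorted(c) for c in clusters.values())


# Toy allelic profiles over six core loci (assumed data).
profiles = {
    "ST1": [1, 1, 1, 1, 1, 1],
    "ST2": [1, 1, 1, 1, 1, 2],
    "ST3": [1, 1, 1, 2, 2, 2],
    "ST4": [5, 5, 5, 5, 5, 5],
}
levels = {t: single_linkage_clusters(profiles, t) for t in (0, 1, 2)}
```

Raising the threshold merges clusters but never splits them, so the levels nest inside one another, and a new profile can be placed at every level by finding its nearest existing neighbor.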

    Mismatch induced speciation in Salmonella: model and data

    In bacteria, DNA sequence mismatches act as a barrier to recombination between distantly related organisms and can potentially promote the cohesion of species. We have performed computer simulations which show that the homology dependence of recombination can cause de novo speciation in a neutrally evolving population once a critical population size has been exceeded. Our model can explain the patterns of divergence and genetic exchange observed in the genus Salmonella, without invoking either natural selection or geographical population subdivision. If this model was validated, based on extensive sequence data, it would imply that the named subspecies of Salmonella enterica correspond to good biological species, making species boundaries objective. However, multilocus sequence typing data, analysed using several conventional tools, provide a misleading impression of relationships within S. enterica subspecies enterica and do not provide the resolution to establish whether new species are presently being formed

    The role of China in the global spread of the current cholera pandemic

    Epidemics and pandemics of cholera, a severe diarrheal disease, have occurred since the early 19th century and waves of epidemic disease continue today. Cholera epidemics are caused by individual, genetically monomorphic lineages of Vibrio cholerae: the ongoing seventh pandemic, which has spread globally since 1961, is associated with lineage L2 of biotype El Tor. Previous genomic studies of the epidemiology of the seventh pandemic identified three successive sub-lineages within L2, designated waves 1 to 3, which spread globally from the Bay of Bengal on multiple occasions. However, these studies did not include samples from China, which also experienced multiple epidemics of cholera in recent decades. We sequenced the genomes of 71 strains isolated in China between 1961 and 2010, as well as eight from other sources, and compared them with 181 published genomes. The results indicated that outbreaks in China between 1960 and 1990 were associated with wave 1 whereas later outbreaks were associated with wave 2. However, the previously defined waves overlapped temporally, and are an inadequate representation of the shape of the global genealogy. We therefore suggest replacing them by a series of tightly delineated clades. Between 1960 and 1990 multiple such clades were imported into China, underwent further microevolution there and then spread to other countries. China was thus both a sink and source during the pandemic spread of V. cholerae, and needs to be included in reconstructions of the global patterns of spread of cholera